Abstract. The perfect phylogeny problem is a classic problem in computational
biology, where we seek an unrooted phylogeny that is compatible with a set of
qualitative characters. Such a tree exists precisely when an intersection graph
associated with the character set, called the partition intersection graph, can
be triangulated using a restricted set of fill edges. Semple and Steel used the
partition intersection graph to characterize when a character set has a unique
perfect phylogeny. Bordewich, Huber, and Semple showed how to use the partition
intersection graph to find a maximum compatible set of characters. In this
paper, we build on these results, characterizing when a unique perfect phylogeny
exists for a subset of partial characters. Our characterization is stated in
terms of minimal triangulations of the partition intersection graph that are
uniquely representable, also known as ur-chordal graphs. Our characterization is
motivated by the structure of ur-chordal graphs, and the fact that the block
structure of minimal triangulations is mirrored in the graph that has been
triangulated.



1 Introduction 

An X-tree is a pair $\mathcal{T}_X = (T, \phi)$ where $T$ is a tree and $\phi$ is a map from $X$ to
the nodes of $T$, such that every node of $T$ with degree one or two is mapped to
by $\phi$. We will call the range of $\phi$ the labeled nodes of $\mathcal{T}_X$, and these nodes are
labeled by $\phi$. The underlying tree of $\mathcal{T}_X$ is $T$. An X-tree is free if $\phi$ is a bijection
to the leaves of $T$, and it is ternary if every internal node of $T$ has degree three.
Given $A \subseteq X$, we will use $\mathcal{T}_X(A)$ to denote the minimal subtree of $T$ containing
the nodes $\phi(A)$. Two subtrees $\mathcal{T}_X(A)$ and $\mathcal{T}_X(A')$ of $T$ intersect if they have one
or more nodes in common, and if $v$ is a common node of $\mathcal{T}_X(A)$ and $\mathcal{T}_X(A')$ we
say that $\mathcal{T}_X(A)$ and $\mathcal{T}_X(A')$ intersect at $v$.

A partial character for $X$ is a partition $\chi = A_1 | A_2 | \ldots | A_r$ of a subset $X' \subseteq X$.
Each $A_i$ is called a cell of $\chi$. If $\mathcal{T}_X(A)$ and $\mathcal{T}_X(A')$ do not intersect for every pair
of distinct cells $A$ and $A'$ of $\chi$, then $\mathcal{T}_X$ displays $\chi$. A perfect phylogeny for a set
of partial characters $\mathcal{C}$ is an X-tree $\mathcal{T}_X$ that displays each character in $\mathcal{C}$. When
$\mathcal{C}$ has a perfect phylogeny, we also say that $\mathcal{C}$ is compatible. The perfect phylogeny
problem (also called the character compatibility problem) is to determine if a set
of partial characters is compatible.

Fig. 1. An X-tree $\mathcal{T}_X$ for $X = \{a, b, c, d, e\}$ and a non-chordal partition intersection graph
$\operatorname{int}(\mathcal{C})$. The characters $\mathcal{C} = \{\chi_1, \chi_2, \chi_3\}$ are $\chi_1 = ab|cd$, $\chi_2 = ac|bde$, and $\chi_3 = ab|de$.
Edges of $\operatorname{int}(\mathcal{C})$ are given by solid edges, and the dashed edge represents the fill edge
required to obtain the triangulation $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ of $\operatorname{int}(\mathcal{C})$. Both $\chi_1$ and $\chi_3$ are displayed
by $\mathcal{T}_X$, but $\chi_2$ is not because $\mathcal{T}_X(ac)$ and $\mathcal{T}_X(bde)$ intersect at $u$ and $v$. This intersection
induces the dashed fill edge of $\operatorname{int}(\mathcal{C})$, which breaks $\chi_2$. The edge $uv$ is distinguished
by $\chi_1$.
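The definition of "displays" can be checked mechanically. The following is a minimal sketch, assuming a hypothetical free ternary X-tree on $X = \{a, b, c, d, e\}$ with internal nodes u, v, w; the figure itself is not reproduced here, so the tree is an assumption chosen to be consistent with the caption of Fig. 1. The names `ADJ`, `PHI`, `minimal_subtree`, `displays` and `is_perfect_phylogeny` are illustrative only.

```python
from itertools import combinations

# Hypothetical free ternary X-tree: leaves are named after their labels,
# and u, v, w are the unlabeled internal nodes (an assumption, see lead-in).
ADJ = {
    "a": {"u"}, "b": {"u"}, "c": {"v"}, "d": {"w"}, "e": {"w"},
    "u": {"a", "b", "v"}, "v": {"u", "c", "w"}, "w": {"v", "d", "e"},
}
PHI = {x: x for x in "abcde"}  # phi maps each element of X to its leaf


def minimal_subtree(adj, phi, cell):
    """Node set of T_X(A): repeatedly prune leaves that are not in phi(A)."""
    targets = {phi[a] for a in cell}
    nodes = set(adj)
    pruned = True
    while pruned:
        pruned = False
        for n in list(nodes):
            if n not in targets and len(adj[n] & nodes) <= 1:
                nodes.remove(n)
                pruned = True
    return nodes


def displays(adj, phi, character):
    """True if the subtrees of distinct cells are pairwise node-disjoint."""
    subtrees = [minimal_subtree(adj, phi, cell) for cell in character]
    return all(s.isdisjoint(t) for s, t in combinations(subtrees, 2))


def is_perfect_phylogeny(adj, phi, characters):
    """True if the X-tree displays every character in the collection."""
    return all(displays(adj, phi, chi) for chi in characters)


CHI1, CHI2, CHI3 = ("ab", "cd"), ("ac", "bde"), ("ab", "de")
```

On this assumed tree, `displays` holds for `CHI1` and `CHI3` but not for `CHI2`, whose two subtrees share the nodes u and v.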



The perfect phylogeny problem reduces to a graph theoretical problem that
we detail now. Given a set of characters $\mathcal{C}$, one can construct the partition
intersection graph $\operatorname{int}(\mathcal{C})$ as follows. The vertex set of $\operatorname{int}(\mathcal{C})$ is

$\{ (A, \chi) \mid \chi \in \mathcal{C} \text{ and } A \text{ is a cell of } \chi \}$,

and there is an edge between two vertices $(A, \chi)$ and $(A', \chi')$ if and only if $A$
and $A'$ have non-empty intersection. For a vertex $(A, \chi)$ of $\operatorname{int}(\mathcal{C})$, $A$ is the cell
of $(A, \chi)$ and $\chi$ is the character of $(A, \chi)$. Observe that if $\chi = A_1 | A_2 | \ldots | A_r$
is a partial character, then any two distinct vertices $(A, \chi)$ and $(A', \chi)$ are
non-adjacent in $\operatorname{int}(\mathcal{C})$.
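The construction of the partition intersection graph translates directly into code. The sketch below builds it for the three characters named in the caption of Fig. 1; cells are written as strings of elements, and the function name and the dictionary layout are assumptions made for illustration.

```python
from itertools import combinations


def partition_intersection_graph(characters):
    """Return (vertices, edges) of int(C) for a dict: character name -> cells."""
    vertices = [(cell, name) for name, cells in characters.items() for cell in cells]
    edges = {
        frozenset((p, q))
        for p, q in combinations(vertices, 2)
        if set(p[0]) & set(q[0])  # the two cells share an element of X
    }
    return vertices, edges


CHARACTERS = {"chi1": ("ab", "cd"), "chi2": ("ac", "bde"), "chi3": ("ab", "de")}
VERTICES, EDGES = partition_intersection_graph(CHARACTERS)
```

Because the cells of one character are disjoint, no edge of the result joins two vertices with the same character name, which matches the observation above.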

A graph is chordal if every cycle of length four or more has a chord, that is, an
edge between vertices of the cycle that do not appear consecutively in the cycle.
In general $\operatorname{int}(\mathcal{C})$ is not a chordal graph, and we are interested in adding edges to
$\operatorname{int}(\mathcal{C})$ to obtain a chordal supergraph $H$ of $\operatorname{int}(\mathcal{C})$ that is called a triangulation
of $\operatorname{int}(\mathcal{C})$. The added edges are called fill edges. If no proper subset of the fill edges
yields a triangulation of $\operatorname{int}(\mathcal{C})$, then $H$ is a minimal triangulation of $\operatorname{int}(\mathcal{C})$. When each
fill edge is of the form $(A, \chi)(A', \chi')$ with $\chi \neq \chi'$, the resulting triangulation is a
proper triangulation of $\operatorname{int}(\mathcal{C})$. The following classic result reduces the question
of determining compatibility to finding proper triangulations of the partition
intersection graph. It was originally phrased in terms of proper triangulations,
but from the definitions it follows that $\operatorname{int}(\mathcal{C})$ has a proper triangulation if and
only if it has a proper minimal triangulation.
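Chordality and properness can both be tested on small instances. The sketch below uses the standard fact that a graph is chordal exactly when simplicial vertices can be deleted one at a time until nothing remains; it is a simple polynomial-time check rather than a linear-time algorithm, and all names are illustrative. The example graph is the partition intersection graph of the characters in the caption of Fig. 1.

```python
from itertools import combinations


def is_chordal(vertices, edges):
    """Chordality test by repeatedly deleting a simplicial vertex."""
    adj = {v: set() for v in vertices}
    for p, q in map(tuple, edges):
        adj[p].add(q)
        adj[q].add(p)

    def simplicial(v, remaining):
        # v is simplicial if its remaining neighbors are pairwise adjacent
        nbrs = adj[v] & remaining
        return all(q in adj[p] for p, q in combinations(nbrs, 2))

    remaining = set(vertices)
    while remaining:
        v = next((v for v in remaining if simplicial(v, remaining)), None)
        if v is None:
            return False  # no simplicial vertex left: a chordless cycle exists
        remaining.remove(v)
    return True


def is_proper(fill_edges):
    """A fill edge (A, chi)(A', chi') is allowed only when chi != chi'."""
    return all(p[1] != q[1] for p, q in map(tuple, fill_edges))


CELLS = [("ab", "chi1"), ("cd", "chi1"), ("ac", "chi2"),
         ("bde", "chi2"), ("ab", "chi3"), ("de", "chi3")]
EDGES = {frozenset((p, q)) for p, q in combinations(CELLS, 2) if set(p[0]) & set(q[0])}
FILL = frozenset({("ac", "chi2"), ("bde", "chi2")})  # joins two cells of chi2
```

For this example the graph is not chordal, adding `FILL` makes it chordal, and `FILL` is not a proper fill edge because both endpoints belong to the same character.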

Theorem 1. [6, 21, 26] Let $\mathcal{C}$ be a set of qualitative characters. Then $\mathcal{C}$ is
compatible if and only if $\operatorname{int}(\mathcal{C})$ has a proper minimal triangulation.

Two X-trees $\mathcal{T}_X = (T, \phi)$ and $\mathcal{T}'_X = (T', \phi')$ are isomorphic, written $\mathcal{T}_X \cong \mathcal{T}'_X$, if there is a bijective
map $\psi : V(T) \to V(T')$ that has the following properties:

1. it preserves labels, meaning that $\phi' = \psi \circ \phi$; and

2. it is a graph isomorphism, that is, $uv \in E(T)$ if and only if $\psi(u)\psi(v) \in E(T')$.

A set of characters $\mathcal{C}$ defines a perfect phylogeny $\mathcal{T}_X$ if $\mathcal{T}_X$ is the unique perfect
phylogeny, up to isomorphism, for $\mathcal{C}$. The unique perfect phylogeny problem is
to determine if a set $\mathcal{C}$ of partial characters defines a perfect phylogeny. If $\mathcal{T}_X =
(T, \phi)$ displays $\chi$ and $uv$ is an edge of $T$ such that $u$ is a node of $\mathcal{T}_X(A)$ and $v$ is
a node of $\mathcal{T}_X(A')$ where $A$ and $A'$ are distinct cells of $\chi$, then $uv$ is distinguished
by $\chi$. If every edge of $\mathcal{T}_X$ is distinguished by at least one character of $\mathcal{C}$, then $\mathcal{T}_X$
is distinguished by $\mathcal{C}$. The following characterization is due to Semple and Steel.

Theorem 2. Let $\mathcal{C}$ be a set of partial characters on $X$. Then $\mathcal{C}$ defines a
perfect phylogeny if and only if the following conditions are satisfied:

(a) $\operatorname{int}(\mathcal{C})$ has a unique proper minimal triangulation $H$; and

(b) there is a free ternary perfect phylogeny for $\mathcal{C}$ and it is distinguished by $\mathcal{C}$.

Further, if $\mathcal{T}_X$ is the unique perfect phylogeny for $\mathcal{C}$, then $\mathcal{T}_X$ is a free ternary
X-tree distinguished by $\mathcal{C}$, and $\operatorname{int}(\mathcal{C}, \mathcal{T}_X) = H$.

This result is the impetus of our current work, and one of our main interests 
is to re-formulate condition (b) in terms of combinatorial structures that play a 
significant role in the study of chordal graphs and minimal triangulations. 

Chordal graphs are characterized by the existence of trees that represent the
adjacency structure of the graph. Suppose $G$ is a graph with vertices $V(G) =
\{x_1, x_2, \ldots, x_n\}$. A tree representation of $G$ consists of a tree $T$ and subtrees
$T_1, T_2, \ldots, T_n$ of $T$ such that two trees $T_i$ and $T_j$ intersect if and only if $x_i$ and
$x_j$ are adjacent. Here, the subtrees are in one-to-one correspondence with the
vertex set of $G$, and this correspondence is made explicit by mapping each subtree
$T_i$ to the vertex $x_i$ of $G$. Observe that a node $v$ of $T$ defines a clique $\mathcal{K}(v) =
\{x_i \mid v \text{ is a node of } T_i\}$ of $G$. Notationally, we will write a tree representation as
an ordered pair $\mathcal{T}_r = (T, \mathcal{K})$ where $\mathcal{K}$ maps nodes of $T$ to cliques of $G$ satisfying
the following properties:

(Edge Coverage) a pair of vertices $x$ and $y$ of $G$ are adjacent if and only if
there is a node $v$ of $T$ such that $x, y \in \mathcal{K}(v)$; and

(Convexity) for each vertex $x$ of $G$, the set of nodes $\{v \in V(T) \mid x \in \mathcal{K}(v)\}$
induces a subtree of $T$ (i.e. a connected subgraph of $T$).

We will frequently refer to the convexity property throughout the paper. As
with X-trees, we will call $T$ the underlying tree of $\mathcal{T}_r$. Often we will define a
tree representation by only specifying the underlying tree $T$ and a collection of
subtrees of $T$, which together implicitly define $\mathcal{K}$.
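The two defining properties of a tree representation are easy to verify for a given pair of tree and clique map. The following sketch assumes graphs and trees are stored as adjacency dictionaries of sets and the map $\mathcal{K}$ as a dictionary from tree nodes to vertex sets; the function names and the two small examples are hypothetical.

```python
def induces_subtree(tree_adj, nodes):
    """True if the node set is non-empty and connected in the tree."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        n = stack.pop()
        for m in tree_adj[n] & nodes:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return seen == nodes


def is_tree_representation(graph_adj, tree_adj, K):
    """Check Edge Coverage and Convexity for K: tree node -> set of vertices."""
    vertices = list(graph_adj)
    for i, x in enumerate(vertices):
        for y in vertices[i + 1:]:
            covered = any(x in K[v] and y in K[v] for v in tree_adj)
            if covered != (y in graph_adj[x]):
                return False  # Edge Coverage fails for the pair x, y
    # Convexity: the nodes whose clique contains x must form a subtree
    return all(
        induces_subtree(tree_adj, [v for v in tree_adj if x in K[v]])
        for x in vertices
    )


# Two triangles sharing the edge {2, 3}, represented on a two-node tree.
G = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
T = {"p": {"q"}, "q": {"p"}}
K = {"p": {1, 2, 3}, "q": {2, 3, 4}}

# A triangle spread over a path p-q-r: edges are covered, convexity fails.
TRIANGLE = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
PATH = {"p": {"q"}, "q": {"p", "r"}, "r": {"q"}}
BAD_K = {"p": {1, 2}, "q": {2, 3}, "r": {1, 3}}
```

The second example shows why convexity is a separate requirement: every edge of the triangle is covered by some node, yet vertex 1 appears at p and r but not at the node q between them.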

Let $T_1, T_2, \ldots, T_k$ be a collection of subtrees of $T$. If each pair $T_i, T_j$ of
subtrees intersect at a node $v_{ij}$, then by the Helly property for subtrees of a tree [9],
all of $T_1, T_2, \ldots, T_k$ intersect at a common node $v$. This property manifests itself
as a statement about cliques of $G$ and nodes of $\mathcal{T}_r$ in the following way: for
any clique $K$ of $G$, there is at least one node $u$ of $T$ such that $K \subseteq \mathcal{K}(u)$. In
particular, this is true when $K$ is a maximal clique of $G$ (i.e. no proper superset
is also a clique), and therefore $\mathcal{K}(V(T))$ contains the set of maximal cliques of
$G$. If the maximal cliques of $G$ are in one-to-one correspondence with the nodes
of $T$ via $\mathcal{K}$, then $\mathcal{T}_r$ is a clique tree of $G$. See Figure 2 for an example.

Theorem 3. [6, 8, 27] The following statements are equivalent. 



(a) G is a chordal graph. 

(b) G has a tree representation. 

(c) G has a clique tree. 

Observe that if $\mathcal{T}_r = (T, \mathcal{K})$ is a clique tree of $G$ and $uv$ is an edge of $T$,
then because $\mathcal{K}(u)$ and $\mathcal{K}(v)$ are both maximal cliques of $G$, there is a vertex
$x$ of $G$ in $\mathcal{K}(u) \setminus \mathcal{K}(v)$. In general, a chordal graph can have exponentially many
clique trees. An algorithm to enumerate clique trees, along with a formula to
count them, appears in [16].

We will often be analyzing a tree representation $\mathcal{T}_r = (T, \mathcal{K})$ of a triangulation
$H$ of $\operatorname{int}(\mathcal{C})$. Given a vertex $(A, \chi)$ of $\operatorname{int}(\mathcal{C})$, we will denote the subtree of $T$ that
it corresponds to by $\mathcal{T}_r(A, \chi)$. Observe that $v$ is a node of $\mathcal{T}_r(A, \chi)$ if and only if
$(A, \chi) \in \mathcal{K}(v)$. Given a set of characters $\mathcal{C}$ on $X$ and an X-tree $\mathcal{T}_X = (T, \phi)$, a
chordal graph $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ is given by adding an edge between two vertices $(A_1, \chi_1)$
and $(A_2, \chi_2)$ if and only if $\mathcal{T}_X(A_1)$ and $\mathcal{T}_X(A_2)$ intersect. This construction, along
with the fact that $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ is a triangulation of $\operatorname{int}(\mathcal{C})$, is well-known in the
phylogenetics literature, and will be discussed in detail in Section 3.

A chordal graph $G$ is uniquely representable, or ur-chordal for short, if it has a
single clique tree. A ur-chordal graph is ternary if each internal node of its
clique tree has degree three, and its leafage¹ is the number of leaves its clique
tree has. Let $H$ be a proper triangulation of $\operatorname{int}(\mathcal{C})$ and $\mathcal{T}_r = (T, \mathcal{K})$ be a clique
tree of $H$. An edge $uv$ of $\mathcal{T}_r$ is incontractable with respect to $\chi$ if there are distinct
cells $A$ and $A'$ of $\chi$ such that $u \in \mathcal{T}_r(A, \chi)$ and $v \in \mathcal{T}_r(A', \chi)$. We say that $\mathcal{T}_r$ is
incontractable with respect to $\mathcal{C}$ if each edge is incontractable with respect to at
least one $\chi \in \mathcal{C}$. Now we present our first main result.

Theorem 4. Suppose $\mathcal{C}$ is a set of partial characters on $X$. Then $\mathcal{C}$ defines a
perfect phylogeny if and only if the following conditions hold:

(a) $\operatorname{int}(\mathcal{C})$ has a unique proper minimal triangulation $H$;

(b) $H$ is a ternary ur-chordal graph with leafage $|X|$; and

(c) each edge of $H$'s unique clique tree is incontractable with respect to $\mathcal{C}$.

Further, if $\mathcal{T}_X$ is the perfect phylogeny defined by $\mathcal{C}$, then $\mathcal{T}_X$ is a free ternary
X-tree distinguished by $\mathcal{C}$, and $\operatorname{int}(\mathcal{C}, \mathcal{T}_X) = H$.

1 In general, the leafage of a chordal graph is the minimum number of leaves that a
clique tree of the graph can have [19].

Let $\mathcal{C}$ be a set of partial characters on $X$ and $\chi \in \mathcal{C}$. Suppose $H$ is a
triangulation of $\operatorname{int}(\mathcal{C})$ with fill edge $(A, \chi)(A', \chi)$ where $A$ and $A'$ are distinct cells of $\chi$.
Then we say that $(A, \chi)(A', \chi)$ breaks $\chi$, and $\chi$ is a broken character of $H$. For
a triangulation $H$ of $\operatorname{int}(\mathcal{C})$, its displayed characters are the characters of $\mathcal{C}$ that
are not broken characters of $H$. Bordewich, Huber, and Semple [3] proved that
it is possible to find a maximum-sized compatible subset of $\mathcal{C}$ using the partition
intersection graph.

Theorem 5. Let $\mathcal{C}$ be a set of partial characters on $X$. Then $\mathcal{C}'$ is a maximum-sized
compatible subset of $\mathcal{C}$ if and only if there is a triangulation $H$ of $\operatorname{int}(\mathcal{C})$
that has $\mathcal{C}'$ as its displayed characters, and any other triangulation of $\operatorname{int}(\mathcal{C})$ has
at most $|\mathcal{C}'|$ displayed characters.
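Broken and displayed characters depend only on the fill edges of a triangulation. A minimal sketch, assuming vertices are stored as (cell, character name) pairs and fill edges as pairs of vertices; the function names are illustrative, and the search over all triangulations that Theorem 5 refers to is not attempted here.

```python
def broken_characters(fill_edges):
    """Characters with a fill edge joining two of their own cells."""
    return {p[1] for p, q in fill_edges if p[1] == q[1]}


def displayed_characters(characters, fill_edges):
    """Characters of the collection that no fill edge breaks."""
    return set(characters) - broken_characters(fill_edges)


CHARACTERS = ["chi1", "chi2", "chi3"]
# The single fill edge described in the caption of Fig. 1.
FILL_EDGES = [(("ac", "chi2"), ("bde", "chi2"))]
```

For the triangulation in Fig. 1 this reports chi2 as broken and chi1, chi3 as displayed, in agreement with the caption.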

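A subset of partial characters $\mathcal{C}' \subseteq \mathcal{C}$ is a maximal defining subset of $\mathcal{C}$ when
$\mathcal{C}'$ defines a perfect phylogeny, and there is no compatible set $\mathcal{C}''$ such that
$\mathcal{C}' \subset \mathcal{C}'' \subseteq \mathcal{C}$. Our second main result is the following.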

Theorem 6. Suppose $\mathcal{C}$ is a set of partial characters on $X$ and $\mathcal{C}' \subseteq \mathcal{C}$. Then
$\mathcal{C}'$ is a maximal defining subset of $\mathcal{C}$ if and only if the following conditions hold:

(a) $\operatorname{int}(\mathcal{C})$ has a unique minimal triangulation $H$ that has $\mathcal{C}'$ as its displayed
characters, and no other minimal triangulation of $\operatorname{int}(\mathcal{C})$ has at least $\mathcal{C}'$ as
its displayed character set;

(b) $H$ is a ternary ur-chordal graph with leafage $|X|$; and

(c) each edge of $H$'s unique clique tree is incontractable with respect to $\mathcal{C}'$.

Further, if $\mathcal{T}_X$ is the perfect phylogeny defined by $\mathcal{C}'$, then $\mathcal{T}_X$ is a free ternary
X-tree distinguished by $\mathcal{C}'$, and $\operatorname{int}(\mathcal{C}, \mathcal{T}_X) = H$.

2 Chordal Graph Preliminaries 

In this section, we detail known results on chordal graphs that are necessary for
the remainder of the paper. Suppose $G = (V, E)$ is a graph and $S \subseteq V$. Let $G - S$
denote the graph obtained from $G$ by removing $S$ and all edges incident to at
least one vertex in $S$. If there are vertices $x$ and $y$ of $G$ that are connected in $G$
but not in $G - S$, then $S$ is an $xy$-separator, and if no proper subset of $S$ has
this property, it is a minimal $xy$-separator. When $S$ is a minimal $xy$-separator
for at least one pair of vertices $x$ and $y$, it is a minimal separator. Minimality
in this definition is relative; it is possible to have containment relationships
between two minimal separators.² The maximal connected subsets of $G - S$ are
the connected components of $G - S$. Let $C$ be a connected component of $G - S$.
The neighborhood of $C$ in $G$, denoted $N(C)$, is the set of vertices of $S$ that are
adjacent to at least one vertex in $C$. If $N(C) = S$, then $C$ is a full component of
$G - S$. The following useful characterization of minimal separators is well-known,
and left as an exercise in [9].

2 Dirac [7] called these sets relatively minimal cut-sets, which is perhaps more
descriptive, but the term minimal separator has stuck in the modern literature on
chordal graphs and minimal triangulations.

(a) $G$ (b) $\mathcal{T}_r$

Fig. 2. A chordal graph $G$ and clique tree $\mathcal{T}_r = (T, \mathcal{K})$ of $G$. There are three clique trees
for $G$, one of which is obtained by removing the bottom-leftmost node and its incident
edge, and then attaching it to $v$. The maximal clique map $\mathcal{K}$ is defined by the triangle
drawn inside each node. Arrows indicate vertices that consist of the intersection of a
neighboring node's maximal clique, and each such intersection is a minimal separator
by Theorem 7.



Lemma 1. Let $G = (V, E)$ be a graph, and $S \subseteq V$. Then $S$ is a minimal
separator if and only if $G - S$ has two or more full components.
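Lemma 1 gives a direct test for minimal separators. The sketch below implements it for graphs stored as adjacency dictionaries of sets; the helper names and the two example graphs are hypothetical.

```python
def components(adj, removed):
    """Connected components of G - S as a list of vertex sets."""
    left = set(adj) - set(removed)
    comps = []
    while left:
        start = left.pop()
        comp, stack = {start}, [start]
        while stack:
            n = stack.pop()
            for m in adj[n] & left:
                left.remove(m)
                comp.add(m)
                stack.append(m)
        comps.append(comp)
    return comps


def full_components(adj, S):
    """Components C of G - S whose neighborhood N(C) equals S."""
    S = set(S)
    return [
        C for C in components(adj, S)
        if (set().union(*(adj[c] for c in C)) & S) == S
    ]


def is_minimal_separator(adj, S):
    """Lemma 1: S is a minimal separator iff G - S has >= 2 full components."""
    return len(full_components(adj, S)) >= 2


CYCLE = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
TWO_TRIANGLES = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
```

In the four-cycle, `{"a", "c"}` is a minimal separator while `{"a"}` is not; in the two triangles sharing an edge, the shared edge `{2, 3}` is the only minimal separator.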

A minimal separator $S$ has multiplicity $k$ if $G - S$ has $k + 1$ full components.
Interestingly, clique trees contain detailed information about the minimal
separators of the graph they represent, which will be useful for our proofs in later
sections.

Theorem 7. Suppose $G = (V, E)$ is a chordal graph, $\mathcal{T}_r = (T, \mathcal{K})$ is a
clique tree of $G$, and $S \subseteq V$. Then $S$ is a minimal separator of $G$ if and only if
there is an edge $uv$ of $T$ such that $S = \mathcal{K}(u) \cap \mathcal{K}(v)$.³ Further, the multiplicity
of $S$ is the number of edges of $T$ with this property.

We will not need all of the following characterizations of ur-chordal graphs, 
but we list them for completeness. 

Theorem 8. [14, 18] Let $G$ be a chordal graph. Then the following statements
are equivalent.

a) $G$ is uniquely representable.

b) If $S$ is a minimal separator of $G$, then there are exactly two maximal cliques
$K$ and $K'$ of $G$ such that $S \subseteq K$ and $S \subseteq K'$.

c) Each minimal separator of $G$ has multiplicity one.

d) There is no minimal separator of $G$ that properly contains another minimal
separator of $G$.

e) The number of minimal separators of $G$ is the number of maximal cliques of
$G$ minus one.

3 This theorem also implies that a minimal separator of a chordal graph is a clique.
In fact, chordal graphs are characterized by having only clique minimal separators,
which is one of the earliest results on chordal graphs [7].

Chordal graphs can be recognized in linear time [2], and the maximal cliques
and minimal separators of a chordal graph may be computed in linear time [1].
Using property e) of Theorem 8, this allows the recognition of ur-chordal graphs
in linear time.
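Property e) of Theorem 8 can be illustrated with a brute-force count. The sketch below is exponential in the number of vertices and is not the linear-time procedure cited above; it enumerates all vertex subsets, uses Lemma 1 to recognize minimal separators, and assumes its input is a connected chordal graph. All names are illustrative.

```python
from itertools import combinations


def is_clique(adj, S):
    return all(q in adj[p] for p, q in combinations(S, 2))


def maximal_cliques(adj):
    """All inclusion-maximal cliques, found by exhaustive enumeration."""
    cliques = [set(S) for r in range(1, len(adj) + 1)
               for S in combinations(adj, r) if is_clique(adj, S)]
    return [K for K in cliques if not any(K < L for L in cliques)]


def full_component_count(adj, S):
    """Number of components C of G - S with N(C) = S."""
    left, count = set(adj) - S, 0
    while left:
        start = left.pop()
        comp, stack = {start}, [start]
        while stack:
            n = stack.pop()
            for m in adj[n] & left:
                left.remove(m)
                comp.add(m)
                stack.append(m)
        if (set().union(*(adj[c] for c in comp)) & S) == S:
            count += 1
    return count


def minimal_separators(adj):
    """Lemma 1: keep the subsets S with at least two full components."""
    return [set(S) for r in range(1, len(adj))
            for S in combinations(adj, r)
            if full_component_count(adj, set(S)) >= 2]


def is_ur_chordal(adj):
    """Property e); assumes adj describes a connected chordal graph."""
    return len(minimal_separators(adj)) == len(maximal_cliques(adj)) - 1


PATH = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
STAR = {1: {2}, 2: {1, 3, 4}, 3: {2}, 4: {2}}
```

The path has three maximal cliques and two minimal separators, so it is ur-chordal. The star has three maximal cliques but the single minimal separator `{2}`, so it is not; its three maximal cliques can be arranged into a clique tree in more than one way.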



3 X-trees, Clique Trees, and Tree Representations

In this section, we define two operations that translate between X-trees and
tree representations, and prove or state results that will be useful in
later sections. Both operations are commonly used in the literature (see [24, 26]),
but do not seem to be named as we will do here. The results in this section will
be useful for proving our characterization for maximal defining subsets.

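Given a set of characters $\mathcal{C}$ on $X$ and an X-tree $\mathcal{T}_X = (T, \phi)$, construct the
chordal graph $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ having:

• vertex set identical to that of $\operatorname{int}(\mathcal{C})$; and

• an edge between $(A, \chi)$ and $(A', \chi')$ if and only if $\mathcal{T}_X(A)$ and $\mathcal{T}_X(A')$ intersect.

This graph has a tree representation $\mathcal{T}_r$ with underlying tree $T$ and subtrees
obtained by defining $\mathcal{T}_r(A, \chi) = \mathcal{T}_X(A)$ for each vertex $(A, \chi)$ of $\operatorname{int}(\mathcal{C})$. Therefore
$\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ is chordal by Theorem 3. Each edge $(A_1, \chi_1)(A_2, \chi_2)$ of $\operatorname{int}(\mathcal{C})$ has cells
$A_1$ and $A_2$ that share at least one member of $X$, say $a$, so $\mathcal{T}_X(A_1)$ and $\mathcal{T}_X(A_2)$
intersect at $\phi(a)$. Therefore $\mathcal{T}_r(A_1, \chi_1)$ and $\mathcal{T}_r(A_2, \chi_2)$ intersect at $\phi(a)$ as well, so
$(A_1, \chi_1)(A_2, \chi_2)$ is an edge of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ by subtree intersection. Hence each edge
of $\operatorname{int}(\mathcal{C})$ is also an edge of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$, so $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ is a triangulation of $\operatorname{int}(\mathcal{C})$.
We will say that $\mathcal{T}_X$ derives the tree representation $\mathcal{T}_r = (T, \mathcal{K})$ of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ and
$\mathcal{T}_r$ is derived from $\mathcal{T}_X$. In general, $\mathcal{T}_r$ is not a clique tree.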
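The derivation of a tree representation from an X-tree can be sketched as follows. The X-tree is a hypothetical free ternary tree on $X = \{a, b, c, d, e\}$ with internal nodes u, v, w, chosen to be consistent with the captions of Fig. 1 and Fig. 3 (the figures themselves are not reproduced here); the function names are assumptions.

```python
from itertools import combinations

# Hypothetical X-tree: leaves are named after their labels.
ADJ = {
    "a": {"u"}, "b": {"u"}, "c": {"v"}, "d": {"w"}, "e": {"w"},
    "u": {"a", "b", "v"}, "v": {"u", "c", "w"}, "w": {"v", "d", "e"},
}
PHI = {x: x for x in "abcde"}
CHARACTERS = {"chi1": ("ab", "cd"), "chi2": ("ac", "bde"), "chi3": ("ab", "de")}


def minimal_subtree(adj, targets):
    """Prune non-target leaves until every leaf of the subtree is a target."""
    nodes = set(adj)
    pruned = True
    while pruned:
        pruned = False
        for n in list(nodes):
            if n not in targets and len(adj[n] & nodes) <= 1:
                nodes.remove(n)
                pruned = True
    return nodes


def derive(adj, phi, characters):
    """Return (subtrees, edges, K) for int(C, T_X) and its derived representation."""
    subtrees = {
        (cell, name): minimal_subtree(adj, {phi[a] for a in cell})
        for name, cells in characters.items() for cell in cells
    }
    # Two vertices are adjacent exactly when their subtrees share a node.
    edges = {
        frozenset((p, q))
        for p, q in combinations(subtrees, 2)
        if subtrees[p] & subtrees[q]
    }
    # K(v) collects the vertices whose subtree passes through node v.
    K = {v: {p for p, nodes in subtrees.items() if v in nodes} for v in adj}
    return subtrees, edges, K


SUBTREES, EDGES, K = derive(ADJ, PHI, CHARACTERS)
```

On this assumed tree, the result has one more edge than the partition intersection graph of the same characters (the fill edge between the two cells of chi2), the nodes w and d map to the same clique, and the clique at leaf a is properly contained in the clique at u. This illustrates why a derived tree representation need not be a clique tree.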

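Observation 1. Let $\mathcal{C}$ be a set of partial characters on $X$, $\mathcal{T}_X$ be an X-tree, and
$\mathcal{T}_r$ be the tree representation derived from $\mathcal{T}_X$. Then the underlying tree of $\mathcal{T}_r$ is
the underlying tree of $\mathcal{T}_X$, and for all $\chi \in \mathcal{C}$ and cells $A$ of $\chi$, $\mathcal{T}_X(A) = \mathcal{T}_r(A, \chi)$.
Further, $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ is a triangulation of $\operatorname{int}(\mathcal{C})$.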

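Lemma 2. Let $\mathcal{C}$ be a set of partial characters on $X$, and $\mathcal{T}_X$ be an X-tree that
displays $\mathcal{C}' \subseteq \mathcal{C}$. Suppose that each edge of $\mathcal{T}_X$ is distinguished by $\mathcal{C}'$. Then the
tree representation of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ derived from $\mathcal{T}_X$ is a clique tree of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$.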

Proof. Let $\mathcal{T}_r = (T, \mathcal{K})$ be the tree representation of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ derived from $\mathcal{T}_X$.
For the sake of contradiction, assume that $\mathcal{T}_r$ is not a clique tree, so that $\mathcal{K}$ is not
a one-to-one map between the nodes of $T$ and the maximal cliques of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$.
Then there must be nodes $u$ and $v$ of $T$ such that $\mathcal{K}(v) \subseteq \mathcal{K}(u)$. Let $v'$ be the
node adjacent to $v$ on the path from $v$ to $u$ in $T$, allowing the possibility that $v' = u$. Note
that each vertex in $\mathcal{K}(v)$ is also a vertex of $\mathcal{K}(v')$ by convexity, so $\mathcal{K}(v) \subseteq \mathcal{K}(v')$.

Fig. 3. The tree representation $\mathcal{T}_r$ of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ derived from $\mathcal{T}_X$ in Figure 1a. The
triangulation $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ of $\operatorname{int}(\mathcal{C})$ is depicted in Figure 1b; note that any two subtrees
of $\mathcal{T}_r$ intersect precisely when the corresponding vertices of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ are adjacent.
Observe that $\mathcal{T}_r$ is not a clique tree; for example, the two left-most nodes map to
non-maximal cliques. Additionally, there are two nodes that map to the maximal clique
$\{(cd, \chi_1), (bde, \chi_2), (de, \chi_3)\}$. Obtaining a clique tree from a tree representation is
described in [8].

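Now, $vv'$ is distinguished by some $\chi \in \mathcal{C}'$, so there are distinct cells $A$ and $A'$
of $\chi$ such that $v$ is a node of $\mathcal{T}_X(A)$ and $v'$ is a node of $\mathcal{T}_X(A')$. By Observation
1, $v$ is a node of $\mathcal{T}_r(A, \chi)$, so $(A, \chi) \in \mathcal{K}(v)$ and by containment $(A, \chi) \in \mathcal{K}(v')$.
Further, $(A', \chi) \in \mathcal{K}(v')$, so $(A, \chi)(A', \chi)$ is a fill edge of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ and it breaks
$\chi$. This contradicts the assumption that $\mathcal{T}_X$ displays $\mathcal{C}'$, so $\mathcal{T}_r$ must be a clique
tree of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$. □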

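Lemma 3. Let $\mathcal{C}$ be a set of partial characters on $X$ and $\mathcal{T}_X$ be an X-tree
that displays $\mathcal{C}' \subseteq \mathcal{C}$. Suppose that $\mathcal{T}_X$ is free, ternary, and each edge of $\mathcal{T}_X$ is
distinguished by $\mathcal{C}'$. Then $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ is uniquely representable.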

Proof. To prove that $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ is uniquely representable, we use Theorem 8 and
show that there are no containment relationships between the minimal separators
of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$. Working towards a contradiction, assume that $S \subset S'$ are minimal
separators of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$, and let $\mathcal{T}_r = (T, \mathcal{K})$ be the tree representation derived
from $\mathcal{T}_X$. Then $\mathcal{T}_r$ is a clique tree by Lemma 2, and there are edges $uv$ and
$u'v'$ of $T$ such that $S = \mathcal{K}(u) \cap \mathcal{K}(v)$ and $S' = \mathcal{K}(u') \cap \mathcal{K}(v')$ by Theorem 7.
Without loss of generality, assume that the path from $v$ to $v'$ does not contain
either $u$ or $u'$ (perhaps $v = v'$). Let $w$ be the node on this path adjacent to $v$
if $v \neq v'$, otherwise let $w = u'$. In either case, $\mathcal{K}(w)$ contains $S$: if $w = u'$, then
$S \subseteq S' \subseteq \mathcal{K}(w)$. Otherwise, each $(A, \chi)$ in $S$ is an element of both $\mathcal{K}(u)$ and $\mathcal{K}(u')$,
so $(A, \chi) \in \mathcal{K}(w)$ by convexity, and $S \subseteq \mathcal{K}(w)$.

To complete the proof, we will obtain a contradiction by showing that $S$
has a vertex not in $\mathcal{K}(w)$. There is a character $\chi'$ in $\mathcal{C}'$ that distinguishes $vw$,
and distinct cells $A'$ and $A''$ of $\chi'$ such that $v$ is a node of $\mathcal{T}_X(A')$ and $w$ is a
node of $\mathcal{T}_X(A'')$. Now, $v$ has at least two neighbors, and because $\mathcal{T}_X$ is ternary,
$v$ must have degree three. Also, $v$ is not mapped to by $\phi$ because $\mathcal{T}_X$ is free,
so in order for $v$ to be a node of $\mathcal{T}_X(A')$, there must be at least two nodes
of $\mathcal{T}_X$ in $\phi(A')$ that are not $v$, and the path between these two nodes must
contain $v$. This path must also contain two of $v$'s neighbors, and neither of
these nodes can be $w$, because $w$ is not a node of $\mathcal{T}_X(A')$. Thus $u$ must be
on this path, so it is a node of $\mathcal{T}_X(A')$. Both $\mathcal{K}(u)$ and $\mathcal{K}(v)$ contain $(A', \chi')$ by
Observation 1, so it must be a vertex of $S = \mathcal{K}(u) \cap \mathcal{K}(v)$. Further, $(A', \chi') \notin \mathcal{K}(w)$
because $w$ is not a node of $\mathcal{T}_X(A') = \mathcal{T}_r(A', \chi')$. This is impossible because we
have shown that both $(A', \chi') \in S \setminus \mathcal{K}(w)$ and $S \subseteq \mathcal{K}(w)$. Thus there are
no containment relationships between the minimal separators of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$, and
$\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ is uniquely representable by Theorem 8. □

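Lemma 4. Let $\mathcal{C}$ be a set of partial characters on $X$, $\mathcal{T}_X$ an X-tree, and
$\mathcal{C}'$ be the subset of $\mathcal{C}$ displayed by $\mathcal{T}_X$. Then $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ is a triangulation of $\operatorname{int}(\mathcal{C})$
in which the displayed characters are $\mathcal{C}'$.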

The previous three lemmas can be summarized as follows. 

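Theorem 9. Let $\mathcal{C}$ be a set of partial characters on $X$ and $\mathcal{T}_X$ be an X-tree
that displays $\mathcal{C}' \subseteq \mathcal{C}$. Suppose that $\mathcal{T}_X$ is free, ternary, and each edge of $\mathcal{T}_X$ is
distinguished by $\mathcal{C}'$. Then $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ is a uniquely representable chordal graph,
and the tree representation derived from $\mathcal{T}_X$ is its unique clique tree. Further, the
displayed characters of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ are exactly $\mathcal{C}'$.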

Now suppose that $\mathcal{T}_r = (T, \mathcal{K})$ is a clique tree of a triangulation $H$ of $\operatorname{int}(\mathcal{C})$,
with the goal of defining an X-tree $\mathcal{T}_X$. The discussion we provide here is
standard, e.g. see [3, 25]. Construct a map $\phi$ from $X$ to the nodes of $T$ by choosing, for each
$a \in X$, a node $\phi(a) = v$ such that $\mathcal{K}(v)$ contains every vertex of $\operatorname{int}(\mathcal{C})$ whose cell contains
$a$. These vertices form a clique because $a$ is contained in each of their cells.
Because we have only added fill edges to obtain $H$, this clique is a subset of a maximal
clique of $H$, and hence $v$ exists. There may be more than one choice for $v$, each
of which we call a candidate node for $a$. Let $u$ be a leaf of $T$ with neighbor $w$.
Then $\mathcal{K}(u)$ is a maximal clique that contains a vertex $(A, \chi)$ of $\operatorname{int}(\mathcal{C})$ that is
not found in $\mathcal{K}(w)$. By convexity, $u$ is the only node of $T$ whose corresponding
maximal clique contains $(A, \chi)$. Each $a' \in A$ has $u$ as its unique candidate node,
and hence every leaf of $T$ is a unique candidate node for at least one element of
$X$. Thus each leaf of $T$ must be labeled by $\phi$. To finish constructing an X-tree,
obtain $T'$ by suppressing any unlabeled nodes of $T$ that have degree two. The
result is an X-tree $\mathcal{T}_X = (T', \phi)$, and we say that $\mathcal{T}_r$ induces $\mathcal{T}_X$ and $\mathcal{T}_X$ is induced
by $\mathcal{T}_r$. We emphasize that the underlying tree of $\mathcal{T}_X$ need not be the same as the
underlying tree of $\mathcal{T}_r$. Note that, because an element of $X$ may have multiple
candidate nodes, $\mathcal{T}_r$ may induce multiple X-trees. Next, we show that when
$H$ is a minimal triangulation, much of $\mathcal{T}_X$'s structure is described by $\mathcal{T}_r$. The
following lemma will be useful.
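Candidate nodes can be computed directly from the clique map. The sketch below assumes a three-node clique tree p, q, r (a path) for the triangulation obtained from the characters of Fig. 1 by adding the single fill edge between the two cells of chi2; the node names and the dictionary `K` are assumptions made for illustration, since the figure is not reproduced here.

```python
VERTICES = [("ab", "chi1"), ("cd", "chi1"), ("ac", "chi2"),
            ("bde", "chi2"), ("ab", "chi3"), ("de", "chi3")]

# Assumed clique tree: the path p - q - r with one maximal clique per node.
K = {
    "p": {("ab", "chi1"), ("ac", "chi2"), ("bde", "chi2"), ("ab", "chi3")},
    "q": {("ac", "chi2"), ("bde", "chi2"), ("cd", "chi1")},
    "r": {("cd", "chi1"), ("bde", "chi2"), ("de", "chi3")},
}


def candidate_nodes(K, vertices, element):
    """Nodes whose clique contains every vertex whose cell contains the element."""
    needed = {p for p in vertices if element in p[0]}
    return {v for v, clique in K.items() if needed <= clique}


CANDIDATES = {a: candidate_nodes(K, VERTICES, a) for a in "abcde"}
```

In this example every element of $X$ has exactly one candidate node, both leaves p and r receive labels, and the degree-two node q is labeled by c, so no node would need to be suppressed.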

Lemma 5. [11, 20] Let $G$ be a graph and $H$ be a minimal triangulation of $G$.
If $uv$ is a fill edge of $H$, then there is a minimal separator of $H$ that contains
both $u$ and $v$.

Fig. 4. (a) A clique tree $\mathcal{T}_r$ of $\operatorname{int}(\mathcal{C}, \mathcal{T}_X)$ from Figure 1b and (b) an X-tree $\mathcal{T}'_X$ induced by
$\mathcal{T}_r$. Note that $\operatorname{int}(\mathcal{C}, \mathcal{T}'_X) = \operatorname{int}(\mathcal{C}, \mathcal{T}_X)$, where $\mathcal{T}_X$ is the X-tree from Fig. 1a.



Though not stated in this form, the following lemma follows from the proof
of Lemma 2.4 and the statement of Corollary 2.5 in [24].



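Lemma 6. Let $H$ be a minimal triangulation of $\operatorname{int}(\mathcal{C})$, and suppose $\mathcal{T}_X$ is
induced by a clique tree of $H$. Then $H = \operatorname{int}(\mathcal{C}, \mathcal{T}_X)$.

Lemma 7. Let $H$ be a minimal triangulation of $\operatorname{int}(\mathcal{C})$, $\mathcal{T}_r = (T, \mathcal{K})$ be a clique
tree of $H$, and suppose $\mathcal{T}_r$ induces $\mathcal{T}_X$. Then the underlying tree of $\mathcal{T}_X$ is $T$.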

Proof. We have already seen that every leaf of $T$ is the unique candidate node
of some element of $X$. In addition to this, it was also shown in [13] that if $u$ is a
node of $\mathcal{T}_r$ of degree two, then $u$ is the unique candidate node of some element
of $X$. This was done by showing that $\mathcal{K}(u)$ contains either:

1. a vertex $(A_1, \chi_1)$ of $\operatorname{int}(\mathcal{C})$ that is not contained in $\mathcal{K}(w)$ for any other node
$w \neq u$ of $T$; or

2. an edge $(A_2, \chi_2)(A_3, \chi_3)$ of $\operatorname{int}(\mathcal{C})$, whose incident vertices have cells with
non-empty intersection, and are not both contained in $\mathcal{K}(w)$ for any other
node $w \neq u$ of $T$.

For completeness, we outline a proof here. Using convexity and the fact that
$u$ has degree two, it follows that $\mathcal{K}(u)$ contains either a vertex or a pair of vertices
that is not contained in $\mathcal{K}(w)$ for any other node $w$ of $T$. It remains to show that
$(A_2, \chi_2)(A_3, \chi_3)$ is actually an
edge of $\operatorname{int}(\mathcal{C})$ (so $A_2 \cap A_3$ is non-empty). If not, then it is a fill edge of $H$, and by Lemma 5 there is a
minimal separator $S$ of $H$ containing both $(A_2, \chi_2)$ and $(A_3, \chi_3)$. By Theorem
7 there is an edge $u_1u_2$ of $T$ such that $S = \mathcal{K}(u_1) \cap \mathcal{K}(u_2)$. But this contradicts
case 2, so it must be that $(A_2, \chi_2)(A_3, \chi_3)$ is an edge of $\operatorname{int}(\mathcal{C})$.

In both cases, $u$ is the unique candidate node of some element in $X$ (this
element is either $a \in A_1$ or $a \in A_2 \cap A_3$), so every degree two node of $T$ is
labeled by $\phi$ and there are no nodes of $T$ that need to be suppressed. Hence the
underlying tree of $\mathcal{T}_X$ is $T$. □

Lemma 8. Let $H$ be a minimal triangulation of $\operatorname{int}(\mathcal{C})$, $\mathcal{T}_r = (T, \mathcal{K})$ be a clique
tree of $H$, and suppose $\mathcal{T}_r$ induces $\mathcal{T}_X$. Then for each vertex $(A, \chi)$ of $\operatorname{int}(\mathcal{C})$,
$\mathcal{T}_X(A) = \mathcal{T}_r(A, \chi)$.

Proof. Let $(A, \chi)$ be a vertex of $\operatorname{int}(\mathcal{C})$ and consider a node $v$ of $\mathcal{T}_X(A)$. Either
$v = \phi(a)$ for some $a \in A$ or $v$ lies between $\phi(a_1)$ and $\phi(a_2)$ for some $a_1, a_2 \in A$.
In the first case, $(A, \chi) \in \mathcal{K}(v)$ because $v$ is a candidate node for $a$. Similarly,
in the second case, $(A, \chi)$ is an element of both $\mathcal{K}(\phi(a_1))$ and $\mathcal{K}(\phi(a_2))$, and
therefore $(A, \chi) \in \mathcal{K}(v)$ by convexity. In both cases $v$ is a node of $\mathcal{T}_r(A, \chi)$, so
$\mathcal{T}_X(A) \subseteq \mathcal{T}_r(A, \chi)$.

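To finish proving equality, suppose that $\mathcal{T}_X(A) \subset \mathcal{T}_r(A, \chi)$. Define a tree
representation $\mathcal{T}'_r = (T', \mathcal{K}')$ of a graph $H'$ as follows: set $T' = T$, and define
subtrees $\mathcal{T}'_r(A', \chi')$ of $T$ for each vertex $(A', \chi')$ of $\operatorname{int}(\mathcal{C})$ as follows:

1. $\mathcal{T}'_r(A', \chi') = \mathcal{T}_r(A', \chi')$ if $(A', \chi') \neq (A, \chi)$, and

2. $\mathcal{T}'_r(A', \chi') = \mathcal{T}_X(A)$ if $(A', \chi') = (A, \chi)$.

We have already seen that $\mathcal{T}'_r(A', \chi') \subseteq \mathcal{T}_r(A', \chi')$ for every vertex $(A', \chi')$ of
$\operatorname{int}(\mathcal{C})$, so the edge set of $H'$ is a subset of the edge set of $H$. If $(A_1, \chi_1)(A_2, \chi_2)$
is an edge of $\operatorname{int}(\mathcal{C})$, then $A_1$ and $A_2$ have at least one element $a$ in common,
and thus $\mathcal{T}_X(A_1)$ and $\mathcal{T}_X(A_2)$ intersect at $\phi(a)$. Further, $\mathcal{T}_X(A') \subseteq \mathcal{T}'_r(A', \chi')$ for
each vertex $(A', \chi')$ of $\operatorname{int}(\mathcal{C})$, so $\mathcal{T}'_r(A_1, \chi_1)$ and $\mathcal{T}'_r(A_2, \chi_2)$ also intersect at $\phi(a)$.
Therefore $(A_1, \chi_1)(A_2, \chi_2)$ is an edge of $H'$, and $H'$ is chordal by Theorem 3, so
it is a triangulation of $\operatorname{int}(\mathcal{C})$.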

To complete the proof, we show that $H$ must have an edge that does not
exist in $H'$. Because $\mathcal{T}_X(A) \subset \mathcal{T}_r(A, \chi)$, there is a node $u$ of $\mathcal{T}_r(A, \chi) \setminus \mathcal{T}_X(A)$ that
is adjacent to a node $w$ of $\mathcal{T}_X(A) = \mathcal{T}'_r(A, \chi)$. By maximality, there is a vertex
$(A', \chi') \in \mathcal{K}(u) \setminus \mathcal{K}(w)$, and because $u$ is a node of $\mathcal{T}_r(A, \chi)$, $(A, \chi) \in \mathcal{K}(u)$.

Therefore $(A', \chi')(A, \chi)$ is an edge of $H$. The situation in $H'$ is different: if
$(A', \chi')(A, \chi)$ is an edge of $H'$, then $\mathcal{T}'_r(A, \chi)$ and $\mathcal{T}'_r(A', \chi')$ intersect at a node
$v'$. But if $v'$ is a node of $\mathcal{T}'_r(A, \chi)$, then $w$ is on the path from $v'$ to $u$, and by
convexity this would imply that $(A', \chi') \in \mathcal{K}(w)$. Hence $(A', \chi')(A, \chi)$ is not an
edge of $H'$, so the edge set of $H'$ is a proper subset of the edge set of $H$. This
is impossible because $H$ is a minimal triangulation of $\operatorname{int}(\mathcal{C})$, so it must be that
$\mathcal{T}_X(A) = \mathcal{T}_r(A, \chi)$ for each vertex $(A, \chi)$ of $\operatorname{int}(\mathcal{C})$. □

Lemmas 6, 7, and 8 are summarized below.

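Theorem 10. Let $H$ be a minimal triangulation of $\operatorname{int}(\mathcal{C})$, $\mathcal{T}_r$ be a clique tree of
$H$, and suppose $\mathcal{T}_r$ induces $\mathcal{T}_X$. Then the underlying tree of $\mathcal{T}_X$ is the underlying
tree of $\mathcal{T}_r$, and for each vertex $(A, \chi)$ of $\operatorname{int}(\mathcal{C})$, $\mathcal{T}_X(A) = \mathcal{T}_r(A, \chi)$. Further,
$H = \operatorname{int}(\mathcal{C}, \mathcal{T}_X)$.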
4 Maximal Defining Subsets of Characters 

This section is devoted to the proof of Theorem 6. Its proof will follow mainly
from Propositions 1 and 2. Recall that, for a graph $H$ and a subset $U$ of its
vertices, the graph $H - U$ is obtained by removing the vertices in $U$ and edges
of $H$ incident to one or more vertices of $U$.

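Lemma 9. Let $\mathcal{C}$ be a set of partial characters and $\mathcal{C}' \subseteq \mathcal{C}$. Suppose $H'$ is a
minimal triangulation of $\operatorname{int}(\mathcal{C}')$, and let $U$ be the vertices of $\operatorname{int}(\mathcal{C})$ that are not
vertices of $\operatorname{int}(\mathcal{C}')$. Then there is a minimal triangulation $H$ of $\operatorname{int}(\mathcal{C})$ such that
$H' = H - U$.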

Proof. Let $H'$ be a minimal triangulation of $\operatorname{int}(\mathcal{C}')$ and $H^*$ be the graph
obtained by adding the following fill edges to $\operatorname{int}(\mathcal{C})$:

1. the fill edges of $H'$; and

2. fill edges of the form $(A, \chi)(A', \chi')$ where $(A, \chi)$ is a vertex of $U$, and $(A', \chi')$
is any vertex of $\operatorname{int}(\mathcal{C})$.

First we prove that $H^*$ is chordal, and then we will use it to construct $H$, a
minimal triangulation of $\operatorname{int}(\mathcal{C})$ such that $H - U = H'$.

Let $(A_1, \chi_1), (A_2, \chi_2), \ldots, (A_k, \chi_k)$ be a cycle of length four or more in $H^*$. If $\chi_i \in \mathcal{C}'$ for all $i =
1, 2, \ldots, k$, then this cycle is also a cycle of $H'$, and therefore has a chord that is
an edge of $H'$. Each edge of $H'$ is also an edge of $H^*$, so this cycle has a chord
in $H^*$. Otherwise, without loss of generality, $\chi_1 \in \mathcal{C} \setminus \mathcal{C}'$ and $(A_1, \chi_1) \in U$, so
$(A_1, \chi_1)(A_3, \chi_3)$ is either an edge of $\operatorname{int}(\mathcal{C})$ or a fill edge of $H^*$ of type 2. In
either case, this cycle has a chord, so $H^*$ is chordal.

Now let $H$ be any minimal triangulation of $\operatorname{int}(\mathcal{C})$ such that the edge set of
$H$ is a subset of the edge set of $H^*$. Every edge of $H - U$ is either an edge of
$\operatorname{int}(\mathcal{C}')$ or is an edge of $H'$ by the construction of $H^*$ and $H$. Therefore the edge
set of $H - U$ is a subset of the edge set of $H'$. Further, $H - U$ is chordal because
any cycle of $H - U$ is a cycle of $H$ (i.e. chordality is inherited [23]), so it is a
triangulation of $\operatorname{int}(\mathcal{C}')$. By minimality of $H'$, it must be that the edge set of
$H - U$ is equal to the edge set of $H'$, so $H' = H - U$. □

Lemma 10. Let $\mathcal{C}$ be a set of partial characters on $X$ and
suppose that $\mathcal{C}'$ is a compatible subset of $\mathcal{C}$. Then there is a minimal triangulation
of $\operatorname{int}(\mathcal{C})$ whose displayed characters are at least $\mathcal{C}'$.

Though not stated in this form, the following lemma is a direct result of
Lemma 5.1 in [3] and its proof.

Lemma 11. Let $\mathcal{C}$ be a set of partial characters on $X$. Suppose $H$ is a
triangulation of $\operatorname{int}(\mathcal{C})$ with displayed characters $\mathcal{C}'$. Then if $\mathcal{T}_X$ is induced by a clique
tree of $H$, it is a perfect phylogeny for $\mathcal{C}'$.

Proposition 1. Let $\mathcal{C}$ be a set of partial characters on $X$ and $\mathcal{C}' \subseteq \mathcal{C}$. Suppose
that the following conditions hold:

(i) $\operatorname{int}(\mathcal{C})$ has a unique minimal triangulation $H$ that has $\mathcal{C}'$ as its displayed
characters, and no other minimal triangulation of $\operatorname{int}(\mathcal{C})$ has at least $\mathcal{C}'$ as
its displayed character set;

(ii) $H$ is a ternary ur-chordal graph with leafage $|X|$; and

(iii) each edge of $H$'s unique clique tree is incontractable with respect to $\mathcal{C}'$.

Then $\mathcal{C}'$ is a maximal defining subset of $\mathcal{C}$.

Proof. We begin by showing that $\mathcal{C}'$ has a unique perfect phylogeny using
Theorem 2, and finish the proof by showing that no proper superset of $\mathcal{C}'$ has a
unique perfect phylogeny.

Throughout the proof, $H$ will denote the unique minimal triangulation of
$\mathrm{int}(\mathcal{C})$ given by (i) whose displayed character set is $\mathcal{C}'$. By
Lemma 11, $\mathcal{C}'$ has a perfect phylogeny, so it is compatible. There is a proper
triangulation of $\mathrm{int}(\mathcal{C}')$ by Theorem 1, so $\mathrm{int}(\mathcal{C}')$ has a proper
minimal triangulation as well. To see that condition (a) of Theorem 2 holds,
suppose that $H'_1$ and $H'_2$ are proper minimal triangulations of
$\mathrm{int}(\mathcal{C}')$ given by Theorem 1, and let $U$ be the set of vertices of
$\mathrm{int}(\mathcal{C})$ not in $\mathrm{int}(\mathcal{C}')$. By Lemma 9 there are minimal
triangulations $H_1$ and $H_2$ of $\mathrm{int}(\mathcal{C})$ that satisfy $H'_1 = H_1 - U$ and
$H'_2 = H_2 - U$. The displayed characters of $H_1$ and $H_2$ must be at least
$\mathcal{C}'$, because any fill edge that breaks a character of $\mathcal{C}'$ would also appear
in $H'_1$ or $H'_2$, and both $H'_1$ and $H'_2$ are proper triangulations of
$\mathrm{int}(\mathcal{C}')$. By (i), we have $H_1 = H = H_2$. Therefore $H'_1 = H - U = H'_2$, so
condition (a) of Theorem 2 is satisfied with respect to $\mathcal{C}'$.

Now we show condition (b) of Theorem 2 holds. Let $\mathcal{T}_r$ be the unique clique
tree of $H$ given by (ii), and suppose $\mathcal{T}_X$ is the $X$-tree induced by
$\mathcal{T}_r$. $\mathcal{T}_X$ displays $\mathcal{C}'$ by Lemma 11, and by Theorem 10 the underlying
tree $T$ of $\mathcal{T}_r$ is also the underlying tree of $\mathcal{T}_X$. Further, $T$ is ternary
and has $|X|$ leaves by (ii), so $\mathcal{T}_X$ must be free and ternary. To see that
$\mathcal{T}_X$ is distinguished by $\mathcal{C}'$, consider an edge $uv$ of $T$. By (iii) $uv$ is
incontractable with respect to $\mathcal{C}'$, so there is a character
$\chi \in \mathcal{C}'$ and distinct cells $A$ and $A'$ of $\chi$ such that $u$ is a node of
$\mathcal{T}_r(A,\chi)$ and $v$ is a node of $\mathcal{T}_r(A',\chi)$. But
$\mathcal{T}_X(A) = \mathcal{T}_r(A,\chi)$ and $\mathcal{T}_X(A') = \mathcal{T}_r(A',\chi)$ by Theorem 10,
so $\chi$ distinguishes $uv$. Hence condition (b) of Theorem 2 also holds with
respect to $\mathcal{C}'$, so $\mathcal{C}'$ defines $\mathcal{T}_X$.

Last, we show that no proper superset of $\mathcal{C}'$ also defines an $X$-tree. If any
proper superset $\mathcal{C}^*$ of $\mathcal{C}'$ were compatible, then by Lemma 10 some minimal
triangulation of $\mathrm{int}(\mathcal{C})$ would have at least $\mathcal{C}^*$ as its displayed
character set. This would contradict (i), so no such superset can exist. This
completes the proof. □

Lemma 12. Suppose $\mathcal{C}'$ is a maximal defining subset of $\mathcal{C}$, and $H$, $H'$ are
minimal triangulations of $\mathrm{int}(\mathcal{C})$ with $\mathcal{C}'$ as their displayed character
set. Then $H = H'$.

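Proof. Let $\mathcal{T}_X$ be an $X$-tree induced by a clique tree $\mathcal{T}_r$ of $H$ and
$\mathcal{T}'_X$ be an $X$-tree induced by a clique tree $\mathcal{T}'_r$ of $H'$. By Lemma 11,
$\mathcal{T}_X$ and $\mathcal{T}'_X$ are perfect phylogenies for $\mathcal{C}'$, and because $\mathcal{C}'$
defines an $X$-tree it must be that $\mathcal{T}_X \cong \mathcal{T}'_X$ via an isomorphism
$\psi$. Additionally, for each vertex $(A,\chi)$ of $\mathrm{int}(\mathcal{C})$ we have
$\mathcal{T}_r(A,\chi) = \mathcal{T}_X(A)$ and $\mathcal{T}'_r(A,\chi) = \mathcal{T}'_X(A)$ by Theorem 10.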

To prove that $H = H'$, it suffices to show that their fill edge sets are the same.
Suppose that $(A_1,\chi_1)(A_2,\chi_2)$ is a fill edge of $H$. By the edge coverage
property of clique trees, $\mathcal{T}_r(A_1,\chi_1)$ and $\mathcal{T}_r(A_2,\chi_2)$ intersect at
a node $v$ of $T$. We will show that $\mathcal{T}'_r(A_1,\chi_1)$ and
$\mathcal{T}'_r(A_2,\chi_2)$ intersect at $\psi(v)$. If there is an $a \in A_1$ such that
$\phi(a) = v$, then $\psi(v) = \phi'(a)$ is a node of
$\mathcal{T}'_X(A_1) = \mathcal{T}'_r(A_1,\chi_1)$. Otherwise there are $a_1, a_2 \in A_1$ such
that $v$ is an internal node on the path from $\phi(a_1) = v_1$ to
$\phi(a_2) = v_2$. Because $\psi$ is a graph isomorphism, $\psi(v)$ is an internal node
on the path from $\psi(v_1)$ to $\psi(v_2)$. Further, $\psi(v_1)$ and $\psi(v_2)$ are nodes
of $\mathcal{T}'_X(A_1) = \mathcal{T}'_r(A_1,\chi_1)$, so $\psi(v)$ is a node of
$\mathcal{T}'_r(A_1,\chi_1)$ as well. In both cases $\psi(v)$ is a node of
$\mathcal{T}'_r(A_1,\chi_1)$, and a similar argument shows that $\psi(v)$ is a node of
$\mathcal{T}'_r(A_2,\chi_2)$. Therefore $\mathcal{T}'_r(A_1,\chi_1)$ and $\mathcal{T}'_r(A_2,\chi_2)$
intersect at $\psi(v)$, so $(A_1,\chi_1)(A_2,\chi_2)$ is a fill edge of $H'$, and the fill
edge set of $H$ is a subset of the fill edge set of $H'$. A symmetric argument
shows that the fill edge set of $H'$ is a subset of the fill edge set of $H$, so
these fill edge sets must be equal, completing the proof. □
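The proof above turns on whether two minimal subtrees of the form $\mathcal{T}_X(A)$ share a node. As an illustration only, the sketch below computes the node set of the minimal subtree spanning a set of labeled nodes by repeatedly pruning unlabeled leaves, and then tests two such subtrees for intersection. The names `minimal_subtree` and `subtrees_intersect`, the adjacency-dictionary representation, and the quartet example are all hypothetical.

```python
def minimal_subtree(tree, targets):
    """Node set of the smallest subtree of `tree` containing `targets`.

    tree: {node: set_of_neighbours}, assumed to be a tree.
    Leaves that are not targets are pruned until none remain.
    """
    targets = set(targets)
    degree = {v: len(nbrs) for v, nbrs in tree.items()}
    alive = set(tree)
    stack = [v for v in tree if degree[v] <= 1 and v not in targets]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue
        alive.remove(v)
        for w in tree[v]:
            if w in alive:
                degree[w] -= 1
                if degree[w] <= 1 and w not in targets:
                    stack.append(w)
    return alive


def subtrees_intersect(tree, phi, cell_a, cell_b):
    """True if the minimal subtrees spanning phi(cell_a) and phi(cell_b) meet."""
    nodes_a = minimal_subtree(tree, {phi[x] for x in cell_a})
    nodes_b = minimal_subtree(tree, {phi[x] for x in cell_b})
    return bool(nodes_a & nodes_b)


# Hypothetical free ternary X-tree on X = {a, b, c, d}: the quartet ab|cd.
tree = {"a": {"u"}, "b": {"u"}, "c": {"v"}, "d": {"v"},
        "u": {"a", "b", "v"}, "v": {"c", "d", "u"}}
phi = {x: x for x in "abcd"}     # each taxon labels the leaf of the same name

print(subtrees_intersect(tree, phi, {"a", "b"}, {"c", "d"}))
print(subtrees_intersect(tree, phi, {"a", "c"}, {"b", "d"}))
```

In the quartet, the subtrees for the cells of ab|cd are disjoint, while those for the cells of ac|bd overlap along the internal edge, so only the former character is displayed by this tree.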

Proposition 2. Let $\mathcal{C}$ be a set of partial characters on $X$ and $\mathcal{C}'$ be a
maximal defining subset of $\mathcal{C}$. Then the following conditions hold:

(i) $\mathrm{int}(\mathcal{C})$ has a unique minimal triangulation $H$ that has $\mathcal{C}'$ as its
displayed characters, and no other minimal triangulation of $\mathrm{int}(\mathcal{C})$ has at
least $\mathcal{C}'$ as its displayed character set;

(ii) $H$ is a ternary ur-chordal graph with leafage $|X|$; and

(iii) each edge of $H$'s unique clique tree is incontractable with respect to $\mathcal{C}'$.

Further, if $\mathcal{C}'$ defines $\mathcal{T}_X$, then $\mathrm{int}(\mathcal{C},\mathcal{T}_X) = H$.

Proof. To see that (i) holds, first observe that $\mathcal{C}'$ is compatible by
definition. There is a minimal triangulation $H_1$ of $\mathrm{int}(\mathcal{C})$ with at least
$\mathcal{C}'$ as its displayed character set by Lemma 10. Because $\mathcal{C}'$ is a maximal
defining subset of $\mathcal{C}$, there is no compatible $\mathcal{C}^*$ with
$\mathcal{C}' \subset \mathcal{C}^* \subseteq \mathcal{C}$. By Lemma 11, the displayed characters of
$H_1$ are compatible, so this set must be exactly $\mathcal{C}'$. This is true of any
minimal triangulation of $\mathrm{int}(\mathcal{C})$ that has at least $\mathcal{C}'$ as its displayed
characters. If $H_2$ is such a minimal triangulation, then $H_1 = H_2$ by Lemma 12,
so there is a unique minimal triangulation of $\mathrm{int}(\mathcal{C})$ that has at least
$\mathcal{C}'$ as its displayed characters. We will refer to this unique minimal
triangulation as $H$ in the remainder of the proof.

Now we show that (ii) holds. Let $\mathcal{T}_X$ be the $X$-tree induced by a clique tree
of $H$. By Lemma 11, $\mathcal{T}_X$ is a perfect phylogeny for $\mathcal{C}'$, and since
$\mathcal{C}'$ is a maximal defining subset it must be that $\mathcal{C}'$ defines $\mathcal{T}_X$.
Recall that $\mathcal{T}_X$ is free, ternary, and distinguished by $\mathcal{C}'$ according to
Theorem 2. By Theorem 9, $\mathrm{int}(\mathcal{C},\mathcal{T}_X)$ is ur-chordal. On the other hand,
$H = \mathrm{int}(\mathcal{C},\mathcal{T}_X)$ by Theorem 10, so $H$ is ur-chordal as well. By the same
theorem, $H$'s unique clique tree has the same underlying tree as $\mathcal{T}_X$. Since
$\mathcal{T}_X$ is ternary, this clique tree must also be ternary, so $H$ is a ternary
ur-chordal graph. This proves statement (ii).

Now consider condition (iii), and let $uv$ be an edge of $T$. By Theorem 2
the edge $uv$ is distinguished by $\mathcal{C}'$, so there is a character
$\chi \in \mathcal{C}'$ that has cells $A \neq A'$ such that $u$ is a node of $\mathcal{T}_X(A)$ and
$v$ is a node of $\mathcal{T}_X(A')$. From Theorem 10 we see that
$\mathcal{T}_X(A) = \mathcal{T}_r(A,\chi)$ and $\mathcal{T}_X(A') = \mathcal{T}_r(A',\chi)$, so $uv$ is
incontractable with respect to $\mathcal{C}'$. Hence $\mathcal{T}_r$ is incontractable with
respect to $\mathcal{C}'$.

The remainder of the proposition was shown while proving that (ii) holds. □

Proof of Theorem. Propositions 1 and 2 show that $\mathcal{C}'$ is a maximal defining
subset of $\mathcal{C}$ if and only if conditions (a)-(c) hold. The fact that $\mathcal{T}_X$ is
free, ternary, and distinguished by $\mathcal{C}'$ follows by Theorem 5. Finally,
$\mathrm{int}(\mathcal{C},\mathcal{T}_X) = H$ due to Proposition 2. □

Proof of Theorem. Use Theorem 6, taking $\mathcal{C}' = \mathcal{C}$. □



5 Discussion 

We conclude with a brief discussion on the role minimal separators play in
minimal triangulation theory [15], and how our characterization may contribute
towards constructing an algorithm that sometimes finds a maximal defining subset
of characters when one exists. Minimal triangulations have been characterized
by their minimal separators, which happen to be minimal separators of the
triangulated graph as well [17][22]. Further, a minimal separator of a minimal
triangulation has connected components (and full components) that are identical
in the graph that has been triangulated [15].

Bouchitté and Todinca [4][5] used minimal separators and potential maximal
cliques, the maximal cliques of minimal triangulations, to create a dynamic
programming algorithm to solve the treewidth and minimum-fill problems in
time polynomial in the number of minimal separators of a graph. This approach
was extended in [11] to create a dynamic programming algorithm that solves a
variety of perfect phylogeny problems, including the unique perfect phylogeny
problem.
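Since the running times above are measured in the number of minimal separators, it can help to see these objects on a toy graph. The sketch below is a brute-force enumeration, not the dynamic programming algorithm of [4][5] or [11]: it uses the standard fact that a vertex set $S$ is a minimal separator exactly when removing it leaves at least two full components, that is, components whose neighborhood is all of $S$. The function names are hypothetical, the input graph is assumed to be connected, and the search is exponential in the number of vertices.

```python
from itertools import combinations


def components(adj, removed):
    """Connected components of the graph after deleting `removed`."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        seen.add(start)
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps


def minimal_separators(adj):
    """All minimal separators of a connected graph {vertex: set_of_neighbours}."""
    found = []
    vertices = list(adj)
    for size in range(1, len(vertices) - 1):
        for subset in combinations(vertices, size):
            sep = set(subset)
            # A component is full if its neighbourhood is the whole separator.
            full = [c for c in components(adj, sep)
                    if set().union(*(adj[v] for v in c)) - c == sep]
            if len(full) >= 2:
                found.append(frozenset(sep))
    return found


# Hypothetical example: a 4-cycle and its minimal triangulation with chord 1-3.
cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
chorded = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 4}, 4: {1, 3}}

print(minimal_separators(cycle))
print(minimal_separators(chorded))
```

On this example the triangulation keeps one of the cycle's two minimal separators and introduces none of its own, which is the behavior described in the first paragraph of this section.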

Our results elucidate the structure of minimal separators of triangulations
associated with maximal defining subsets of characters. This structure is
retained in the partition intersection graph, and is closely related to the
structure of potential maximal cliques, because the connected components obtained
by removing the vertices in a potential maximal clique have neighborhoods that
are minimal separators [1]. This may allow for the computation of a ternary
ur-chordal minimal triangulation in time polynomial in the number of minimal
separators of $\mathrm{int}(\mathcal{C})$ (or asserting that no ternary ur-chordal minimal
triangulations exist), yielding a candidate subset $\mathcal{C}'$ of $\mathcal{C}$ that may be a
maximal defining subset of characters. The number of minimal separators of
$\mathrm{int}(\mathcal{C}')$ is bounded by the number of minimal separators of
$\mathrm{int}(\mathcal{C})$ (this is a specific example of a more general fact; see Corollary 4
in [5]). Therefore, if it is computationally feasible to find $\mathcal{C}'$ because
$\mathrm{int}(\mathcal{C})$ has a small number of minimal separators, checking whether
$\mathcal{C}'$ defines a perfect phylogeny using the method from [11] may also be feasible.